Abstract


A string w covers another string z if every symbol of z is within some occurrence
of w in z. A string is called superprimitive if it is covered only by itself, and
quasiperiodic if it is covered by some shorter string. This paper presents an
$O(\log\log n)$ time $\frac{n\log n}{\log\log n}$-processor CRCW-PRAM algorithm that tests
if a string is superprimitive. The algorithm is the fastest possible with this number of
processors over a general alphabet.


1  Introduction 

Quasiperiodicity, as defined by Apostolico and Ehrenfeucht [3], is an avoidable regularity
of strings that is strongly related to other regularities such as periods and squares [12].
Apostolico, Farach and Iliopoulos [4] and Breslauer [7] gave linear-time sequential algorithms
that test if a string is superprimitive. Apostolico and Ehrenfeucht [3] presented an algorithm
that finds all maximal quasiperiodic substrings of a string.

This paper presents a parallel algorithm that tests if a string of length n is superprimitive
in $O(\log\log n)$ time on a $\frac{n\log n}{\log\log n}$-processor CRCW-PRAM. This is the
first efficient parallel algorithm for this problem. The algorithm works under the general
alphabet assumption, where the only access it has to the input string is by pairwise
comparisons of symbols.

A parallel algorithm is said to achieve an optimal speedup if its time-processor product
is the same as the running time of the fastest sequential algorithm for the problem. We
show that any $\frac{n\log n}{\log\log n}$-processor parallel superprimitivity testing
algorithm over a general alphabet must take at least $\Omega(\log\log n)$ time. Thus, the
algorithm presented in this paper is a factor of $\log n$ processors away from optimality.
Note that there exists a trivial constant time superprimitivity testing algorithm that uses
$n^2$ processors.

The superprimitivity testing algorithm follows techniques that were used in solving several
other parallel string problems [1, 2, 6, 10]. In particular, it uses the parallel string
matching algorithm of Breslauer and Galil [8] as a procedure that solves several string
matching problems simultaneously and then combines the results of the string matching
problems into an answer to the superprimitivity problem.


The paper is organized as follows. Section 2 gives basic definitions and properties of
strings. Section 3 overviews the parallel algorithms and tools that are used in the
superprimitivity testing algorithm. Section 4 describes the basic step which is used by the
superprimitivity testing algorithm in Section 5. Section 6 gives the lower bound.


2  Properties  of  Strings 

A string w covers a string z if for every position $i \in \{1, \ldots, |z|\}$ of z there
exists an occurrence of w starting at some position j of z such that
$1 \le j \le i \le j + |w| - 1 \le |z|$. A string z is called superprimitive if it is covered
only by itself, and quasiperiodic if it is covered by a string w such that $w \ne z$. A
superprimitive string w that covers a string z is called a quasiperiod of z.
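
The definition can be turned directly into a sequential check. The following is only a
minimal illustrative sketch, not the parallel algorithm of this paper; the function names
`covers` and `is_superprimitive` are hypothetical, and positions are 0-indexed as is usual in
Python. It relies on the observation that the first symbol of z can only be covered by an
occurrence starting at the first position, so every cover of z is a prefix of z.

```python
def covers(w: str, z: str) -> bool:
    """Return True if every position of z lies inside some occurrence of w in z."""
    if not w or len(w) > len(z):
        return False
    covered_up_to = 0  # positions z[0:covered_up_to] are already known to be covered
    for j in range(len(z) - len(w) + 1):
        if z.startswith(w, j):
            if j > covered_up_to:  # an uncovered gap precedes this occurrence
                return False
            covered_up_to = j + len(w)
    return covered_up_to == len(z)


def is_superprimitive(z: str) -> bool:
    """A string is superprimitive if no shorter string (necessarily a prefix) covers it."""
    return not any(covers(z[:k], z) for k in range(1, len(z)))
```

For instance, `covers("aba", "ababa")` holds because the occurrences at positions 0 and 2
overlap and together reach the end of the string, so "ababa" is quasiperiodic, while "abc"
is superprimitive.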

A string z has a period w if z is a prefix of $w^k$ for some integer k. Alternatively, a
string w is a period of a string z if $z = w^l v$ and v is a possibly empty prefix of w. The
shortest period of a string z is called the period of z. Clearly, a string is always a period
of itself.

We say that a non-empty string w is a border of a string z if z starts and ends with an
occurrence of w. That is, $z = uw$ and $z = wv$ for some possibly empty strings u and v.
Clearly, a string is always a border of itself. This border is called the trivial border.

We next describe a few simple facts which are used in the superprimitivity testing
algorithm. Most of these facts were used in the sequential algorithms [4, 7], where their
proofs can be found.

Fact 2.1 A string z has a period of length $\pi$, such that $\pi < |z|$, if and only if it
has a non-trivial border of length $|z| - \pi$.
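
The correspondence in Fact 2.1 can be checked by brute force on small strings. The sketch
below is an illustration under assumed names (`period_lengths`, `border_lengths` and
`fact_2_1_holds` are hypothetical helpers, not part of the algorithms cited in this paper);
it enumerates period and border lengths straight from the definitions.

```python
def period_lengths(z: str) -> list:
    """Lengths p in 1..|z| such that z[i] == z[i + p] wherever both positions exist."""
    n = len(z)
    return [p for p in range(1, n + 1)
            if all(z[i] == z[i + p] for i in range(n - p))]


def border_lengths(z: str) -> list:
    """Lengths b in 1..|z| such that the prefix and the suffix of length b coincide."""
    n = len(z)
    return [b for b in range(1, n + 1) if z[:b] == z[n - b:]]


def fact_2_1_holds(z: str) -> bool:
    """Check that periods shorter than |z| correspond to non-trivial borders."""
    n = len(z)
    periods = {p for p in period_lengths(z) if p < n}
    borders = {b for b in border_lengths(z) if b < n}
    return borders == {n - p for p in periods}
```

For the string "ababa" the period lengths are 2, 4 and 5 and the border lengths are 1, 3
and 5, so the periods of length 2 and 4 correspond to the borders of length 3 and 1.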

Fact 2.2 If a string w covers a string z then w is a border of z.

Note that by the last fact any cover of a string z can be represented by a single integer
that is the length of the border of z.

Fact 2.3 If a string w covers a string z and another string v, such that $|w| < |v|$, is a
border of z then w also covers v.

Fact 2.4 If a string z is covered by two strings w and v, such that $|w| < |v|$, then w
covers v. Therefore, a string cannot have two different quasiperiods w and v.

Fact 2.5 If a string z has a border w, such that $2|w| > |z|$, then w covers z.

Proof: The string w covers the first half of the string z since it is a prefix of z and the
last half of the string z since it is also a suffix. Therefore, all symbols of z are covered
by w. □


3  The  CRCW-PRAM  Model


The algorithms described in this paper are for the concurrent-read concurrent-write parallel
random access machine model. We use the weakest version of this model, called the common
CRCW-PRAM. In this model many processors have access to a shared memory. Concurrent
read and write operations are allowed at all memory locations. If several processors attempt
to write simultaneously to the same memory location, it is assumed they always attempt to
write the same value.

The  superprimitivity  testing  algorithm  uses  the  following  previously  known  algorithms: 

1. A parallel string matching algorithm that finds all occurrences of a given pattern in
a given text. The input to the string matching algorithm consists of two strings,
pattern[1..m] and text[1..n], and the output is a Boolean array match[1..n] that has
a "true" value in each position where an occurrence of the pattern starts in the
text. We use the Breslauer and Galil [8] parallel string matching algorithm that takes
$O(\log\log m)$ time on a $\frac{n}{\log\log m}$-processor CRCW-PRAM. This algorithm is the
fastest optimal parallel string matching algorithm on a general alphabet, as implied by a
lower bound of Breslauer and Galil [9].

2. The parallel algorithm of Breslauer and Galil [10] that finds all periods of a string of
length n in $O(\log\log n)$ time on a $\frac{n}{\log\log n}$-processor CRCW-PRAM. The output
of this algorithm is a Boolean array periods[1..n] that has a "true" value at each position
which is a period of the input string.

3. The algorithm of Fich, Ragde and Wigderson [11] to compute the minimum of n
integers between 1 and n in constant time using an n-processor CRCW-PRAM.

One of the major issues in the design of PRAM algorithms is the assignment of processors
to their tasks. We ignore this issue in this paper and use a general theorem that states
that the assignment can be done.

Theorem 3.1 (Brent [5]) Any synchronous parallel algorithm of time t that consists of a
total of x elementary operations can be implemented on p processors in
$\lceil x/p \rceil + t$ time.
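
As an illustration of how Theorem 3.1 is typically applied, consider a constant-time step
that performs at most $cn$ elementary operations on an input of length n. The constant c and
the choice $p = \frac{n}{\log\log n}$ are assumptions made for this example, not values stated
in the theorem. Then

\[
\left\lceil \frac{x}{p} \right\rceil + t
\;\le\; \left\lceil \frac{cn}{n/\log\log n} \right\rceil + O(1)
\;=\; \lceil c \log\log n \rceil + O(1)
\;=\; O(\log\log n),
\]

so a constant-time n-processor computation can be slowed down to $O(\log\log n)$ time on
$\frac{n}{\log\log n}$ processors.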


4  The  Basic  Step 

This  section  shows  how  to  test  efficiently  whether  a  given  string  w  covers  another  string  z. 

Theorem 4.1 Given two strings z and w, there exists an algorithm that tests whether w
covers z in $O(\log\log |z|)$ time on a $\frac{|z|}{\log\log |z|}$-processor CRCW-PRAM.


Proof:  The  algorithm  has  two  steps: 


1. Using Breslauer and Galil's [8] string matching algorithm, find all occurrences of w in
z. This step takes $O(\log\log |z|)$ time and uses $\frac{|z|}{\log\log |z|}$ processors.

2. Using Fich, Ragde and Wigderson's [11] integer minima algorithm, verify that each
symbol of z is within an occurrence of w. This step takes constant time and uses |z|
processors; by Theorem 3.1 it can also be carried out in $O(\log\log |z|)$ time on
$\frac{|z|}{\log\log |z|}$ processors. It can be done as follows:

The string z is partitioned into consecutive blocks of length |w|. The computation
proceeds simultaneously in each block.

The positions in z of the first and last occurrences of w in each block are found using
Fich, Ragde and Wigderson's [11] integer minima algorithm. All symbols of z which are
between the first and last occurrences of w in the same block are obviously covered. All
that remains is to check whether the symbols between the last occurrence of w in each
block and the first occurrence of w in the next block are also covered, by testing if the
distance between these occurrences is smaller than or equal to |w|. Special attention is
needed for the first and last blocks, where the algorithm checks if an occurrence starts
at positions 1 and $|z| - |w| + 1$ of z. □
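
The two steps of the proof can be sketched sequentially as follows. This is an
illustration only: the name `covers_by_blocks` is hypothetical, positions are 0-indexed,
a plain scan stands in for the parallel string matching algorithm [8], and a loop over the
occurrences stands in for the per-block integer minima computations [11]. A block that
could hold an occurrence but holds none is treated as a failure, since the last symbol of
such a block cannot be covered.

```python
def covers_by_blocks(w: str, z: str) -> bool:
    """Test whether w covers z using blocks of length |w|, as in the proof sketch."""
    m, n = len(w), len(z)
    if m == 0 or m > n:
        return False
    # Step 1: all occurrences of w in z (sequential stand-in for string matching).
    starts = [j for j in range(n - m + 1) if z.startswith(w, j)]
    # Step 2: first and last occurrence inside each block of length m.
    num_blocks = (n - m) // m + 1  # blocks that can contain the start of an occurrence
    first = [None] * num_blocks
    last = [None] * num_blocks
    for j in starts:  # conceptually one minimum and one maximum per block
        b = j // m
        if first[b] is None:
            first[b] = j
        last[b] = j
    # The first and last blocks: occurrences must start at positions 0 and n - m.
    if first[0] != 0 or last[-1] != n - m:
        return False
    # Every block must contain an occurrence, otherwise some symbol is uncovered.
    if any(f is None for f in first):
        return False
    # Consecutive blocks: the gap between occurrences must not exceed |w|.
    return all(first[b + 1] - last[b] <= m for b in range(num_blocks - 1))
```

In the parallel setting each block is handled independently, which is why only the
boundary between neighbouring blocks needs an explicit distance test.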

5  The  Superprimitivity  Test 

This  section  describes  the  parallel  superprimitivity  test  algorithm. 

Theorem 5.1 There exists an algorithm that computes the quasiperiod of a string z in
$O(\log\log |z|)$ time on a $\frac{|z|\log |z|}{\log\log |z|}$-processor CRCW-PRAM.

Proof:  The  algorithm  consists  of  four  steps. 

1. Compute all borders of z using Breslauer and Galil's [10] algorithm that finds all
periods of a string. Recall that by Fact 2.2 if w covers z then w must be a border of
z, and by Fact 2.1 there is a one-to-one correspondence between the borders and the
periods of a string.

2. Partition the borders of z into intervals $[2^i .. 2^{i+1} - 1]$ according to their
length. If there is more than one border of z whose length is in the same interval, then by
Fact 2.5 the shorter border covers the longer one. By Fact 2.4 only the shortest border in
each interval is a candidate for the quasiperiod of z. The shortest border in each interval
can be found in constant time with |z| processors by using Fich, Ragde and Wigderson's [11]
integer minima algorithm in each interval simultaneously.

3. In each interval simultaneously, check if the shortest border in the interval covers z.
There are $O(\log |z|)$ intervals, and by Theorem 4.1 each check takes $O(\log\log |z|)$ time
on $\frac{|z|}{\log\log |z|}$ processors.

4. The shortest border that was found to cover z is the quasiperiod of z. □
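
The four steps can be sketched sequentially as follows. This is a minimal illustration
under stated assumptions, not the parallel implementation: the names `covers` and
`quasiperiod` are hypothetical, borders are found by direct comparison instead of the
periods algorithm [10], and the candidates are tested one after another rather than
simultaneously.

```python
def covers(w: str, z: str) -> bool:
    """Return True if every position of z lies inside some occurrence of w in z."""
    if not w or len(w) > len(z):
        return False
    covered_up_to = 0
    for j in range(len(z) - len(w) + 1):
        if z.startswith(w, j):
            if j > covered_up_to:
                return False
            covered_up_to = j + len(w)
    return covered_up_to == len(z)


def quasiperiod(z: str) -> str:
    """Return the quasiperiod of z (z itself when z is superprimitive)."""
    n = len(z)
    # Step 1: lengths of all borders of z, in increasing order.
    borders = [b for b in range(1, n + 1) if z[:b] == z[n - b:]]
    # Step 2: keep only the shortest border in each interval [2^i .. 2^(i+1) - 1].
    shortest = {}
    for b in borders:
        shortest.setdefault(b.bit_length() - 1, b)  # bit_length() - 1 == floor(log2 b)
    # Steps 3 and 4: test each candidate; the shortest one that covers z is the answer.
    for i in sorted(shortest):
        w = z[:shortest[i]]
        if covers(w, z):
            return w
    return z  # only reached for the empty string
```

For "abaabaab" the borders have lengths 2, 5 and 8, one per interval; "ab" does not cover
the string but "abaab" does, so the quasiperiod is "abaab". A string is superprimitive
exactly when `quasiperiod(z) == z`.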


6  The  Lower  Bound 


We prove a lower bound for testing if a string is superprimitive by a reduction to the lower
bound for string matching by Breslauer and Galil [9]. That lower bound is on the number of
comparison rounds an algorithm that computes the period of a string has to perform when
there are p comparisons in each round. The lower bound holds for the CRCW-PRAM model
in the case of a general alphabet, where the only access an algorithm has to the input
strings is by pairwise comparisons of symbols.

Breslauer and Galil [9] show that an adversary can fool any algorithm which claims to
test if a string of length n has a period which is shorter than half of its length in less
than $\Omega(\frac{n}{p} + \log\log_{\lceil 1+p/n \rceil} 2p)$ rounds of p comparisons each.
Without going into the details of that lower bound, we use the fact that the adversary of
Breslauer and Galil [9] answers the comparisons in each round in such a way that after
$\Omega(\frac{n}{p} + \log\log_{\lceil 1+p/n \rceil} 2p)$ rounds it is still possible
that the input string has a period that is shorter than half of its length or that it does
not have any such period. In the latter case there is at least one symbol of the string that
does not appear anywhere else.

Lemma  6.1  The  string  generated  by  Breslauer  and  Galil’s  [9]  adversary  is  superprimitive 
if  and  only  if  it  does  not  have  a  period  that  is  shorter  than  half  of  its  length. 

Proof: If a string has a period that is shorter than half of its length, then by Fact 2.1 it
has a border that is longer than half of its length and by Fact 2.5 it is quasiperiodic. On
the other hand, if a string has a symbol that appears only once, then it is superprimitive. □

Theorem 6.2 Any comparison based parallel string superprimitivity test with p comparisons
in each round must take at least $\Omega(\frac{n}{p} + \log\log_{\lceil 1+p/n \rceil} 2p)$
rounds.

Proof: By Lemma 6.1 the lower bound of Breslauer and Galil [9] holds also for
superprimitivity testing. □

Corollary  6.3  The  algorithm  described  in  Section  5  is  the  fastest  possible  with  the  number 
of  processors  used. 

Proof: Substitute $p = \frac{n\log n}{\log\log n}$ in Theorem 6.2. □
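
The substitution can be spelled out as follows. This is a sketch that assumes the bound of
Theorem 6.2 has the form $\Omega(\frac{n}{p} + \log\log_{\lceil 1+p/n \rceil} 2p)$, that
logarithms are to base 2, and that n is large enough for $\log\log n \ge 2$. With
$p = \frac{n\log n}{\log\log n}$,

\[
\left\lceil 1 + \frac{p}{n} \right\rceil
= \left\lceil 1 + \frac{\log n}{\log\log n} \right\rceil \le \log n,
\qquad 2p \ge n,
\]

\[
\log\log_{\lceil 1+p/n \rceil} 2p
= \log \frac{\log 2p}{\log \lceil 1+p/n \rceil}
\ge \log \frac{\log n}{\log\log n}
= \log\log n - \log\log\log n
= \Omega(\log\log n).
\]

Under these assumptions the $O(\log\log n)$ running time of the algorithm in Section 5
matches the lower bound up to a constant factor.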